Sense Embeddings in Knowledge-Based Word Sense Disambiguation

Authors

  • Loïc Vial
  • Benjamin Lecouteux
  • Didier Schwab
Abstract

In this paper, we develop a new way of creating sense vectors for any dictionary: using an existing word-embeddings model, we sum the vectors of the terms inside a sense's definition, weighted according to their part of speech and their frequency. These vectors are then used to find the closest senses to any other sense, thus automatically creating a semantic network of related concepts. This network is evaluated against the existing semantic network found in WordNet by comparing its contribution to a knowledge-based method for Word Sense Disambiguation. The method can be applied to any other language that lacks such a semantic network, as the creation of word vectors is fully unsupervised and the creation of sense vectors only requires a traditional dictionary. The results show that our generated semantic network greatly improves the WSD system, almost as much as the manually created one.
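
A minimal sketch of the construction described above, assuming a plain `word_vectors` dictionary mapping terms to NumPy arrays, corpus `word_counts` for the frequency weighting, and illustrative part-of-speech weights; the exact weighting scheme used in the paper may differ.

```python
import numpy as np

# Hypothetical part-of-speech weights; the paper weights terms by POS, but the
# exact values here are illustrative only.
POS_WEIGHT = {"NOUN": 1.0, "VERB": 0.75, "ADJ": 0.5, "ADV": 0.5, "OTHER": 0.1}

def sense_vector(definition_tokens, word_vectors, word_counts, total_count):
    """Sum the embeddings of a sense's definition terms, weighted by their
    part of speech and (inverse) corpus frequency, then L2-normalise."""
    acc = None
    for token, pos in definition_tokens:            # (surface form, coarse POS) pairs
        vec = word_vectors.get(token)
        if vec is None:
            continue                                # skip out-of-vocabulary terms
        # Rarer words carry more meaning: use a simple inverse-frequency weight.
        freq = word_counts.get(token, 1) / total_count
        weight = POS_WEIGHT.get(pos, POS_WEIGHT["OTHER"]) * -np.log(freq)
        acc = weight * vec if acc is None else acc + weight * vec
    if acc is None or not np.any(acc):
        return None
    return acc / np.linalg.norm(acc)

def nearest_senses(target_vector, all_sense_vectors, k=5):
    """Rank senses by cosine similarity (vectors are already normalised),
    yielding the edges of the automatically generated semantic network."""
    sims = {sid: float(np.dot(target_vector, v))
            for sid, v in all_sense_vectors.items()}
    return sorted(sims, key=sims.get, reverse=True)[:k]
```

Linking each sense to its k nearest neighbours produces the network of related concepts that the abstract compares against WordNet's hand-built relations.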

Related resources

Semi-Supervised Word Sense Disambiguation Using Word Embeddings in General and Specific Domains

One of the weaknesses of current supervised word sense disambiguation (WSD) systems is that they only treat a word as a discrete entity. However, a continuous-space representation of words (word embeddings) can provide valuable information and thus improve generalization accuracy. Since word embeddings are typically obtained from unlabeled data using unsupervised methods, this method can be see...


Distributional Lesk: Effective Knowledge-Based Word Sense Disambiguation

We propose a simple, yet effective, Word Sense Disambiguation method that uses a combination of a lexical knowledge-base and embeddings. Similar to the classic Lesk algorithm, it exploits the idea that overlap between the context of a word and the definition of its senses provides information on its meaning. Instead of counting the number of words that overlap, we use embeddings to compute the ...
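
As a rough illustration of the overlap-as-similarity idea described in this snippet, the following sketch scores each sense by the cosine similarity between an averaged context embedding and an averaged definition embedding; the averaging and the scoring function are assumptions standing in for the paper's exact formulation.

```python
import numpy as np

def avg_embedding(tokens, word_vectors):
    """Average the embeddings of the in-vocabulary tokens; None if no token is known."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else None

def distributional_lesk(context_tokens, sense_glosses, word_vectors):
    """Choose the sense whose gloss embedding is most similar to the context
    embedding, replacing Lesk's word-overlap count with cosine similarity."""
    ctx = avg_embedding(context_tokens, word_vectors)
    if ctx is None:
        return None
    best_sense, best_score = None, float("-inf")
    for sense_id, gloss_tokens in sense_glosses.items():
        gloss = avg_embedding(gloss_tokens, word_vectors)
        if gloss is None:
            continue
        score = float(np.dot(ctx, gloss) /
                      (np.linalg.norm(ctx) * np.linalg.norm(gloss)))
        if score > best_score:
            best_sense, best_score = sense_id, score
    return best_sense
```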


Integrating WordNet for Multiple Sense Embeddings in Vector Semantics

Popular distributional approaches to semantics allow for only a single embedding of any particular word. A single embedding per word conflates the distinct meanings of the word and their appropriate contexts, irrespective of whether those usages are related or completely disjoint. We compare models that use the graph structure of the knowledge base WordNet as a post-processing step to improve v...


Biomedical Word Sense Disambiguation with Neural Word and Concept Embeddings

Addressing ambiguity issues is an important step in natural language processing (NLP) pipelines designed for information extraction and knowledge discovery. This problem is also common in biomedicine, where NLP applications have become indispensable to exploit latent information from biomedical literature and ...


Detecting Most Frequent Sense using Word Embeddings and BabelNet

Since the inception of the SENSEVAL evaluation exercises there has been a great deal of research into Word Sense Disambiguation (WSD). Over the years, various supervised, unsupervised and knowledge-based WSD systems have been proposed. Beating the first-sense heuristic is a challenging task for these systems. In this paper, we present our work on Most Frequent Sense (MFS) detection usin...



Publication year: 2017